The Generalized A* Architecture
We consider the problem of computing a lightest derivation of a global
structure using a set of weighted rules. A large variety of inference problems
in AI can be formulated in this framework. We generalize A* search and
heuristics derived from abstractions to a broad class of lightest derivation
problems. We also describe a new algorithm that searches for lightest
derivations using a hierarchy of abstractions. Our generalization of A* gives a
new algorithm for searching AND/OR graphs in a bottom-up fashion. We discuss
how the algorithms described here provide a general architecture for addressing
the pipeline problem --- the problem of passing information back and forth
between various stages of processing in a perceptual system. We consider
examples in computer vision and natural language processing. We apply the
hierarchical search algorithm to the problem of estimating the boundaries of
convex objects in grayscale images and compare it to other search methods. A
second set of experiments demonstrates the use of a new compositional model for
finding salient curves in images.
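The abstract's generalization of A* operates bottom-up over derivations; as a baseline point of reference, ordinary A* over a weighted graph (which the paper extends) can be sketched in a few lines. The graph encoding and all function names below are illustrative assumptions, not the paper's architecture:

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    """Best-first search returning the cost of a lightest path.

    `neighbors(n)` yields (next_node, edge_weight) pairs; `heuristic(n)`
    must never overestimate the remaining cost (admissibility), which
    guarantees the first expansion of `goal` is optimal.
    """
    frontier = [(heuristic(start), 0.0, start)]
    best = {start: 0.0}
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node == goal:
            return g
        if g > best.get(node, float("inf")):
            continue  # stale queue entry, already improved
        for nxt, w in neighbors(node):
            ng = g + w
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                heapq.heappush(frontier, (ng + heuristic(nxt), ng, nxt))
    return float("inf")
```

With a non-trivial admissible heuristic the queue expands fewer nodes; with `heuristic = lambda n: 0` the search reduces to Dijkstra's algorithm.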
Detection of visual defects in citrus fruits: multivariate image analysis vs graph image segmentation
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-40261-6_28

This paper presents an application of visual quality control in orange post-harvesting, comparing two different approaches that correspond to two very different methodologies in the area of Computer Vision. The first approach is based on Multivariate Image Analysis (MIA) and was originally developed for the detection of defects in random color textures. It uses Principal Component Analysis and the T2 statistic to map the defective areas. The second approach is based on Graph Image Segmentation (GIS). It is an efficient segmentation algorithm that uses a graph-based representation of the image and a predicate to measure the evidence of boundaries between adjacent regions. While the MIA approach performs novelty detection on defects using a trained model of sound color textures, the GIS approach is a strictly unsupervised method that requires no training on sound or defective areas. Both methods are compared through experimental work on a ground truth of 120 citrus samples from four different cultivars. Although the GIS approach is faster and achieves better results in defect detection, the MIA method produces fewer false detections and does not rely on the hypothesis that the largest area in a sample always corresponds to the non-damaged area.

López García, F.; Andreu García, G.; Valiente González, JM.; Atienza Vanacloig, VL. (2013). Detection of visual defects in citrus fruits: multivariate image analysis vs graph image segmentation. In Computer Analysis of Images and Patterns. Springer Verlag (Germany). 8047:237-244. doi:10.1007/978-3-642-40261-6_28
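The MIA approach's combination of PCA and the T2 statistic can be illustrated with a minimal numpy sketch: fit principal components on pixels sampled from sound (defect-free) texture, then flag pixels whose Hotelling T2 score is anomalous. The function names, component count, and synthetic data are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def fit_t2_model(sound_pixels, n_components=2):
    """Fit a PCA model on (n, 3) color samples from defect-free texture."""
    mean = sound_pixels.mean(axis=0)
    X = sound_pixels - mean
    cov = np.cov(X, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigval)[::-1][:n_components]
    return mean, eigvec[:, order], eigval[order]  # loadings + variances

def t2_score(pixels, model):
    """Hotelling T^2 of each pixel: large scores indicate novelty (defects)."""
    mean, loadings, variances = model
    scores = (pixels - mean) @ loadings           # project onto the PCs
    return np.sum(scores**2 / variances, axis=1)
```

Thresholding `t2_score` (e.g., at a high percentile of the sound-texture scores) maps candidate defective areas without any training on defect examples.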
Superpixel quality in microscopy images: the impact of noise & denoising
Microscopy is a valuable imaging tool in various biomedical research areas. Recent developments have made high-resolution acquisition possible within a relatively short time. State-of-the-art imaging equipment such as serial block-face electron microscopes acquires gigabytes of data in a matter of hours. In order to make these amounts of data manageable, a more data-efficient representation is required. A popular approach to such data efficiency is superpixels, which are designed to cluster homogeneous regions without crossing object boundaries. The use of superpixels as a pre-processing step has shown significant improvements in making computationally intensive computer vision analysis algorithms more tractable on large amounts of data. However, microscopy datasets in particular can be degraded by noise, and most superpixel algorithms do not take this degradation into account. In this paper, we give a quantitative and qualitative comparison of superpixels generated on original and denoised images. We show that several advanced superpixel techniques are hampered by noise and require denoising and parameter tuning as a pre-processing step. The evaluation is performed on the Berkeley segmentation dataset as well as on fluorescence and scanning electron microscopy data.
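The noise sensitivity of superpixels can be probed even with a toy pipeline: a minimal SLIC-style clustering (k-means in intensity-position space) preceded by an optional box-filter denoise. This numpy sketch is far simpler than the advanced techniques evaluated in the paper; all names and parameters are assumptions:

```python
import numpy as np

def slic_lite(img, n_seg=4, m=10.0, iters=5):
    """Minimal SLIC-style superpixels for a 2-D float grayscale image.

    k-means in (intensity, y, x) space; `m` weights spatial against
    intensity distance, as in the original SLIC formulation.
    """
    h, w = img.shape
    step = int(np.sqrt(h * w / n_seg))
    ys, xs = np.mgrid[step // 2:h:step, step // 2:w:step]
    centers = np.stack([img[ys, xs].ravel(),
                        ys.ravel().astype(float),
                        xs.ravel().astype(float)], axis=1)
    yy, xx = np.mgrid[0:h, 0:w]
    feats = np.stack([img.ravel(), yy.ravel().astype(float),
                      xx.ravel().astype(float)], axis=1)
    for _ in range(iters):
        d_int = (feats[:, None, 0] - centers[None, :, 0]) ** 2
        d_sp = ((feats[:, None, 1:] - centers[None, :, 1:]) ** 2).sum(-1)
        labels = np.argmin(d_int + (m / step) ** 2 * d_sp, axis=1)
        for k in range(len(centers)):          # recompute cluster centers
            mask = labels == k
            if mask.any():
                centers[k] = feats[mask].mean(axis=0)
    return labels.reshape(h, w)

def box_denoise(img, r=1):
    """Simple (2r+1)x(2r+1) box-filter denoising with edge padding."""
    pad = np.pad(img, r, mode="edge")
    out = np.zeros_like(img)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += pad[r + dy:r + dy + img.shape[0],
                       r + dx:r + dx + img.shape[1]]
    return out / (2 * r + 1) ** 2
```

Running `slic_lite` on `box_denoise(noisy_img)` versus the raw image mimics, in miniature, the original-versus-denoised comparison performed in the paper.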
A Q-Ising model application for linear-time image segmentation
A computational method is presented which efficiently segments digital
grayscale images by directly applying the Q-state Ising (or Potts) model. Since
the Potts model was first proposed in 1952, physicists have studied lattice
models to gain deep insights into magnetism and other disordered systems. For
some time, researchers have realized that digital images may be modeled in much
the same way as these physical systems (i.e., as a square lattice of numerical
values). A major drawback in using Potts model methods for image segmentation
is that, with conventional methods, it processes in exponential time. Advances
have been made via certain approximations to reduce the segmentation process to
power-law time. However, in many applications (such as for sonar imagery),
real-time processing requires much greater efficiency. This article contains a
description of an energy minimization technique that applies four Potts
(Q-Ising) models directly to the image and processes in linear time. The result
is analogous to partitioning the system into regions of four classes of
magnetism. This direct Potts segmentation technique is demonstrated on
photographic, medical, and acoustic images.

Comment: 7 pages, 8 figures, RevTeX, uses subfigure.sty. Central European Journal of Physics, in press (2010).
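The article's contribution is a linear-time technique applying four Potts models directly to the image; that exact method is not reproduced here. As a generic illustration of the underlying energy, the sketch below evaluates a Potts (Q-Ising) energy and runs one ICM (iterated conditional modes) sweep, which also costs linear time per pass. The class means, `beta`, and all names are assumptions:

```python
import numpy as np

def potts_energy(img, labels, means, beta):
    """E = sum_i (img_i - mu_{l_i})^2 + beta * #{neighbor pairs with differing labels}."""
    data = ((img - means[labels]) ** 2).sum()
    smooth = ((labels[:, 1:] != labels[:, :-1]).sum()
              + (labels[1:, :] != labels[:-1, :]).sum())
    return data + beta * smooth

def icm_sweep(img, labels, means, beta):
    """One in-place ICM pass: each pixel takes the label minimizing its local energy."""
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            best, best_e = labels[y, x], float("inf")
            for q in range(len(means)):
                e = (img[y, x] - means[q]) ** 2
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w and labels[ny, nx] != q:
                        e += beta  # Potts penalty for disagreeing neighbors
                if e < best_e:
                    best, best_e = q, e
            labels[y, x] = best
    return labels
```

ICM only finds a local minimum; the point here is that each sweep touches every pixel a constant number of times, i.e., linear time per pass.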
Learning to Detect and Track Visible and Occluded Body Joints in a Virtual World
Multi-people tracking in an open-world setting requires a special effort in precise detection. Moreover, temporal continuity in the detection phase gains importance when scene clutter introduces the challenging problem of occluded targets. To this end, we propose a deep network architecture that jointly extracts people's body parts and associates them across short temporal spans. Our model explicitly deals with occluded body parts by hallucinating plausible solutions for joints that are not visible. We propose a new end-to-end architecture composed of four branches (visible heatmaps, occluded heatmaps, part affinity fields and temporal affinity fields) fed by a time-linker feature extractor. To overcome the lack of surveillance data with tracking, body part and occlusion annotations, we created the largest Computer Graphics dataset to date for people tracking in urban scenarios (about 500,000 frames, almost 10 million body poses) by exploiting a photorealistic videogame. Our architecture, trained on virtual data, also exhibits good generalization on public real tracking benchmarks when image resolution and sharpness are high enough, producing reliable tracklets useful for further batch data association or re-identification modules.
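A small, generic piece of such a pipeline is decoding joint coordinates from predicted heatmaps. The sketch below takes each channel's argmax and suppresses weak peaks; this is a standard post-processing step, not the paper's four-branch network or its temporal linking:

```python
import numpy as np

def decode_joints(heatmaps, threshold=0.1):
    """Recover 2-D joint locations from per-joint heatmaps of shape (J, H, W).

    Each channel's peak gives one joint; peaks below `threshold` are
    reported as None (joint not detected, e.g., fully occluded).
    """
    joints = []
    for hm in heatmaps:
        y, x = np.unravel_index(np.argmax(hm), hm.shape)
        joints.append((int(y), int(x)) if hm[y, x] >= threshold else None)
    return joints
```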
A study of observation scales based on Felzenszwalb-Huttenlocher dissimilarity measure for hierarchical segmentation
Hierarchical image segmentation provides a region-oriented scale-space, i.e., a set of image segmentations at different detail levels in which the segmentations at finer levels are nested with respect to those at coarser levels. Guimarães et al. proposed a hierarchical graph-based image segmentation (HGB) method based on the Felzenszwalb-Huttenlocher dissimilarity. This HGB method computes, for each edge of a graph, the minimum scale in a hierarchy at which two regions linked by this edge should merge according to the dissimilarity. In order to generalize this method, we first propose an algorithm to compute the intervals which contain all the observation scales at which the associated regions should merge. Then, following the current trend in mathematical morphology to study criteria which are not increasing on a hierarchy, we present various strategies to select a significant observation scale in these intervals. We use the BSDS dataset to assess our observation scale selection methods. The experiments show that some of these strategies lead to better segmentation results than those obtained with the original HGB method.
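For a single edge between two fixed regions, the Felzenszwalb-Huttenlocher merging predicate w <= min(Int(C1) + k/|C1|, Int(C2) + k/|C2|) can be solved for the smallest scale k in closed form. This toy helper illustrates only that local computation; the HGB method and the intervals proposed in the paper operate over a whole hierarchy:

```python
def min_merge_scale(w, int1, size1, int2, size2):
    """Smallest scale k at which edge weight `w` satisfies the
    Felzenszwalb-Huttenlocher predicate for regions with internal
    dissimilarities int1, int2 and sizes size1, size2.

    Solving w <= Int(C) + k/|C| for each region gives
    k >= (w - Int(C)) * |C|, so the answer is the larger requirement
    (clamped at 0 when the edge already satisfies the predicate).
    """
    return max(0.0, (w - int1) * size1, (w - int2) * size2)
```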
On morphological hierarchical representations for image processing and spatial data clustering
Hierarchical data representations in the context of classification and data
clustering were put forward during the fifties. Recently, hierarchical image
representations have gained renewed interest for segmentation purposes. In this
paper, we briefly survey fundamental results on hierarchical clustering and
then detail recent paradigms developed for the hierarchical representation of
images in the framework of mathematical morphology: constrained connectivity
and ultrametric watersheds. Constrained connectivity can be viewed as a way to
constrain an initial hierarchy in such a way that a set of desired constraints
are satisfied. The framework of ultrametric watersheds provides a generic scheme
for computing any hierarchical connected clustering, in particular when such a
hierarchy is constrained. The suitability of this framework for solving
practical problems is illustrated with applications in remote sensing.
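A classical bridge between hierarchical clustering and ultrametrics is the subdominant (single-linkage) ultrametric: the largest ultrametric below a given dissimilarity. The dense-matrix sketch below computes it with a minimax Floyd-Warshall update; it illustrates the general idea only, not the ultrametric watershed algorithm on edge-weighted graphs:

```python
import numpy as np

def subdominant_ultrametric(d):
    """Largest ultrametric below the dissimilarity matrix `d`.

    u[i, j] is the minimum over paths from i to j of the largest step
    along the path (single-linkage merging height), computed by a
    minimax variant of Floyd-Warshall.
    """
    u = d.astype(float).copy()
    n = len(u)
    for k in range(n):
        # relax every pair through intermediate point k
        u = np.minimum(u, np.maximum(u[:, k:k + 1], u[k:k + 1, :]))
    return u
```

The result satisfies the ultrametric inequality u(i, j) <= max(u(i, k), u(k, j)), which is exactly what makes it equivalent to a hierarchy of nested clusters.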
Re-Identification for Improved People Tracking
Re-identification is usually defined as the problem of deciding whether a person currently in the field of view of a camera has been seen earlier either by that camera or another. However, a different version of the problem arises even when people are seen by multiple cameras with overlapping fields of view. Current tracking algorithms can easily get confused when people come close to each other and merge trajectory fragments into trajectories that include erroneous identity switches. Preventing this means re-identifying people across trajectory fragments. In this chapter, we show that this can be done very effectively by formulating the problem as a minimum-cost maximum-flow linear program. This version of the re-identification problem can be solved in real-time and produces trajectories without identity switches. We demonstrate the power of our approach both in single- and multi-camera setups to track pedestrians, soccer players, and basketball players.
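The chapter formulates re-identification across trajectory fragments as a minimum-cost maximum-flow linear program solvable in real time. The toy below optimizes the same one-to-one linking objective by exhaustive search over permutations, which is feasible only for a handful of fragments; it is a stand-in for intuition, not the chapter's solver, and the cost matrix is a made-up example of appearance or motion dissimilarities:

```python
from itertools import permutations

def link_fragments(cost):
    """Minimum-cost one-to-one linking of trajectory fragments.

    `cost[i][j]` is the dissimilarity between fragment i (ending) and
    fragment j (starting).  Brute force over all assignments; a
    min-cost max-flow formulation solves the same problem at scale.
    """
    n = len(cost)
    best, best_c = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_c:
            best, best_c = perm, c
    return list(best), best_c
```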
Free-hand sketch synthesis with deformable stroke models
We present a generative model which can automatically summarize the stroke
composition of free-hand sketches of a given category. When our model is fit to
a collection of sketches with similar poses, it discovers and learns the
structure and appearance of a set of coherent parts, with each part represented
by a group of strokes. It represents both consistent (topology) as well as
diverse aspects (structure and appearance variations) of each sketch category.
Key to the success of our model are important insights learned from a
comprehensive study performed on human stroke data. By fitting this model to
images, we are able to synthesize visually similar and pleasant free-hand
sketches.
Truncated Inference for Latent Variable Optimization Problems: Application to Robust Estimation and Learning
Optimization problems with an auxiliary latent variable structure in addition
to the main model parameters occur frequently in computer vision and machine
learning. The additional latent variables make the underlying optimization task
expensive, either in terms of memory (by maintaining the latent variables), or
in terms of runtime (repeated exact inference of latent variables). We aim to
remove the need to maintain the latent variables and propose two formally
justified methods that dynamically adapt the required accuracy of latent
variable inference. These methods have applications in large scale robust
estimation and in learning energy-based models from labeled data.

Comment: 16 pages.
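A concrete instance of this latent-variable structure in robust estimation is IRLS, where per-sample weights play the role of latent variables and are usually re-inferred exactly at every step. The sketch below truncates that inner inference to a fixed number of reweighting iterations, in the spirit of (but not identical to) the paper's dynamically adapted accuracy; all names and the Huber weight choice are assumptions:

```python
import numpy as np

def irls_robust_mean(x, delta=1.0, iters=3):
    """Robust (Huber) location estimate via iteratively reweighted
    least squares.  Each iteration infers latent weights `w` from the
    current residuals, then re-solves the weighted mean; truncating
    `iters` limits how accurately the latent weights are inferred.
    """
    mu = np.median(x)  # cheap robust initialization
    for _ in range(iters):
        r = np.abs(x - mu)
        # Huber weights: 1 inside delta, downweighted outside
        w = np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))
        mu = np.sum(w * x) / np.sum(w)
    return mu
```

Even with `iters=3`, a gross outlier is heavily downweighted and barely moves the estimate, unlike the ordinary mean.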